More general structure for loop parsing #91

Merged
merged 8 commits into from
Nov 29, 2024

dop-amin
Collaborator

This PR adds a more generic structure for handling loops: each subclass of Loop implements a specific type of loop, for example one that ends with a certain sequence of instructions. The extraction works the same way for all of them and is therefore implemented in Loop, while the methods that produce the code differ.

I am open to suggestions on how to improve this approach; for my use cases w.r.t. armv7m it worked fine like this.
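
For illustration, a minimal sketch of the structure described above, not the code from this PR. `Loop`, `LeLoop`, `_extract` and `FatalParsingException` are names that appear in the conversation; `SubsLoop`, the `end_regex` attribute, the `<lbl>` placeholder, the `end()` signature and the three-part return value are assumptions made for the example.

```python
import re


class FatalParsingException(Exception):
    """Stand-in for the exception type named in the PR."""


class Loop:
    """Shared extraction logic; subclasses describe how their loop ends."""

    # Hypothetical attribute: one regex per closing instruction.
    end_regex = ()

    def __init__(self, lbl):
        self.lbl = lbl

    def _extract(self, source, lbl):
        """Return (pre, body, post): lines before, inside and after the loop."""
        lines = [ln.strip() for ln in source]
        try:
            start = lines.index(f"{lbl}:")
        except ValueError:
            raise FatalParsingException(f"Couldn't identify loop {lbl}") from None
        # Substitute the label into each pattern (assumed placeholder syntax).
        pats = [re.compile(r.replace("<lbl>", re.escape(lbl)))
                for r in self.end_regex]
        n = len(pats)
        for i in range(start + 1, len(lines) - n + 1):
            if all(p.fullmatch(lines[i + k]) for k, p in enumerate(pats)):
                return lines[:start], lines[start + 1:i], lines[i + n:]
        raise FatalParsingException(f"Couldn't identify loop {lbl}")

    def end(self, other, indent=""):
        """Yield the closing instructions; differs per loop type."""
        raise NotImplementedError


class LeLoop(Loop):
    """Loop closed by a single `le <cnt>, <lbl>` instruction."""

    end_regex = (r"le\s+\w+\s*,\s*<lbl>",)

    def end(self, other, indent=""):
        yield f"{indent}le {other['cnt']}, {self.lbl}"


class SubsLoop(Loop):
    """Loop closed by `subs <cnt>, <cnt>, #<imm>` followed by `bne <lbl>`."""

    end_regex = (
        r"subs\s+(?P<cnt>\w+)\s*,\s*(?P=cnt)\s*,\s*#?(?P<imm>\d+)",
        r"bne\s+<lbl>",
    )

    def end(self, other, indent=""):
        yield f"{indent}subs {other['cnt']}, {other['cnt']}, #{other['imm']}"
        yield f"{indent}bne {self.lbl}"
```

In this sketch only `end_regex` and `end()` vary between loop types, which mirrors the split between shared extraction and type-specific code generation described in the PR text.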

reg1 = p.group("reg1")
imm = p.group("imm")
state = 2
if additional_data is None:
Collaborator

Can you say what the intended logic is here? Are you expecting this to be set by the first end-regexp?

Collaborator Author

Yes, the first end-regexp sets additional_data. I made this choice because that's usually the subs or cmp in which we can see the counter register and/or decrement.
I realize that I probably want to collect additional_data for each instruction in the "end" part and then merge the dictionaries at the end. I'll change that.
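
A rough sketch of the per-instruction merge proposed here, assuming each end-regexp exposes its data through named groups. The patterns, the group names and the helper `collect_additional_data` are hypothetical and only serve the example.

```python
import re

# Hypothetical end-of-loop patterns; the named groups are illustrative.
END_REGEXPS = [
    re.compile(r"subs\s+(?P<cnt>\w+)\s*,\s*(?P=cnt)\s*,\s*#(?P<imm>\d+)"),
    re.compile(r"bne\s+(?P<target>\w+)"),
]


def collect_additional_data(end_lines):
    """Match each end line against its regexp and merge all named groups.

    Returns None if any line fails to match its pattern.
    """
    additional_data = {}
    for pattern, line in zip(END_REGEXPS, end_lines):
        m = pattern.fullmatch(line.strip())
        if m is None:
            return None
        # Later instructions overwrite earlier ones on a key clash.
        additional_data.update(m.groupdict())
    return additional_data
```

With `dict.update`, a group name used by two end instructions keeps the value from the later one, so distinct group names per instruction avoid silent overwrites.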

for loop_type in Loop.__subclasses__():
try:
l = loop_type(lbl)
return l._extract(source, lbl) + (l,)
Collaborator

The ... + (l,) may merit a comment. What's happening here? While you are at it, could you document the return tuple given by _extract() in a brief docstring?

Collaborator Author

Will do. (l,) wraps l in a one-element tuple, which is then concatenated with the tuple returned by _extract.
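
To make the tuple concatenation concrete, here is a small self-contained sketch of the dispatch over `Loop.__subclasses__()`. The `except FatalParsingException` clause, the function name `extract`, the two toy subclasses and the three-element return of `_extract` are assumptions; the snippet under review only shows the `try` body.

```python
class FatalParsingException(Exception):
    """Stand-in for the exception type named in the PR."""


class Loop:
    def __init__(self, lbl):
        self.lbl = lbl

    def _extract(self, source, lbl):
        raise FatalParsingException(f"Couldn't identify loop {lbl}")


class NeverMatches(Loop):
    """Toy loop type whose extraction always fails."""


class MatchesEverything(Loop):
    """Toy loop type that treats the whole source as the loop body."""

    def _extract(self, source, lbl):
        return [], list(source), []  # a 3-tuple: (pre, body, post)


def extract(source, lbl):
    """Try each loop type in turn; return (pre, body, post, loop_object)."""
    for loop_type in Loop.__subclasses__():
        try:
            loop = loop_type(lbl)
            # (loop,) is a one-element tuple; `+` appends it to the 3-tuple.
            return loop._extract(source, lbl) + (loop,)
        except FatalParsingException:
            continue  # assumed behaviour: fall through to the next loop type
    raise FatalParsingException(f"Couldn't identify loop {lbl}")
```

A caller can then unpack the result as `pre, body, post, loop = extract(source, lbl)` and use `loop` to regenerate the matching loop start and end.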

raise FatalParsingException(f"Couldn't identify loop {lbl}")

class LeLoop(Loop):

Collaborator

Nit: Add a docstring giving an example for the type of loop recognized here?

Collaborator

@hanno-becker hanno-becker left a comment

Thank you @dop-amin, overall this looks good -- better flexibility for loop forms has been an embarrassing gap for a while.

While you're at it, could you improve the doc a bit, and hoist the abstract Loop class somewhere where it can be shared between architecture models?

Also, would you mind adding minimal examples to examples.py which demonstrate each loop form, so we run them in CI?

slothy/core/slothy.py (outdated, resolved)
@dop-amin
Collaborator Author

dop-amin commented Nov 21, 2024

@hanno-becker I think the PR is now a good starting point for abstracting the loop handling. However, new cases may pop up in the future that require slight tweaks, e.g., passing more or different inputs to the loop subclasses. I already pass some data that I know will be useful from our experiments with Armv7m, especially for more complicated loop constructions that check against a pointer that is modified inside the kernel.

Are there any tests you'd like me to run besides CI? I have already fully optimized one aarch64 example, and the output code still passes the test.

@dop-amin dop-amin marked this pull request as ready for review November 21, 2024 10:48
@dop-amin dop-amin mentioned this pull request Nov 22, 2024
@hanno-becker hanno-becker self-requested a review November 25, 2024 06:38
@dop-amin
Collaborator Author

> Thank you @dop-amin, overall this looks good -- better flexibility for loop forms has been an embarrassing gap for a while.
>
> While you're at it, could you improve the doc a bit, and hoist the abstract Loop class somewhere where it can be shared between architecture models?
>
> Also, would you mind adding minimal examples to examples.py which demonstrate each loop form, so we run them in CI?

I think this should all be done by now.

```
loop_lbl:
{code}
le <cnt>, loop_lbl
```
Collaborator

The implementation is assuming that cnt increases by 1 per iteration, right?

Collaborator

Can this be broken by SW pipelining if the loop count increment is chosen as an early instruction, or not?

Collaborator

No, my fault -- this is already part of LE, I had forgotten.

"""Locate a loop with start label `lbl` in `source`.

We currently only support the following loop forms:
yield f"{indent}sub {other['cnt']}, {other['cnt']}, {other['imm']}"
Collaborator

This should use the same format that the original loop had? With/without flag, and potentially using cbnz, bnz, bne?

Collaborator

This is pre-existing, so let's not block the PR because of it.

@hanno-becker hanno-becker merged commit 90a78c0 into slothy-optimizer:main Nov 29, 2024
24 checks passed